Today’s Agenda

  • Getting to know each other
  • Topological Data Analysis (TDA)
    • Why and How
  • The problem of shape reconstruction
  • Shape reconstruction techniques
    • Vietoris–Rips complexes, Persistent Homology
  • Opportunities
  • Questions

About Me

Tulane University, New Orleans

PhD in Mathematics

UC Berkeley, California

Data Science Postdoc Researcher

George Washington University, DC

Assistant Professor of Data Science

Get in Touch

  • Website: smajhi.com
  • Email: s.majhi@gwu.edu

Topological Data Analysis

What is Data Science?

  • What is Data Science?

    It’s the art of learning meaningful patterns from data.

  • What is Topological Data Analysis (TDA)?

    TDA is a subfield of data science that focuses on the systematic learning of geometric and topological patterns of data.

Contributing Fields

  • Computational Geomtry

    geometric algorithms to compute convex hulls, geometric complexes, polytopes

  • Computational Topology

    algorithms to compute various topological invariants—e.g., homology groups, Betti numbers, Euler characteristic—of a finite-enough topological space

  • Applied Topology

    apply topological ideas in sensor network, motion planning in robotics configuration spaces, etc.

Shape of Data

A sample

Learn Geometric Patterns / Features

A Reconstruction

  • The topology of data is the wedge of two circles (two \(1\)-dimensional cycles)
  • The geometry of the data is graph-like

Success of TDA

  • Data inherit an intrinsic topological structure in many applications
  • Topology-inspired methods are robust to small perturbations (stability)
  • Easy to mitigate the curse of dimensionality (scalability)
  • Geometry- and topology-aware deep learning methods facilitate easy interpretation (intuitive)
    • geometricdeeplearning.com
  • Free, open-source software like Giotto-tda, Gudhi

Particular Application Areas

  • Shape Reconstruction

    • Manifold: Majhi (2023a)
    • Graph: Majhi (2023b)
    • More general spaces: Komendarczyk, Majhi, and Tran (2024)
  • Riemannian manifold learning

    • Niyogi, Smale, and Weinberger (2008)
  • Dimensionality reduction

    • Fefferman Fefferman, Mitter, and Narayanan (2016)
  • Detection of Financial Market Crash

  • Predicting Indian Monsoon

  • Deep Learning Pipelines

    • topological deep learning

The Problem of Shape Reconstruction

Good Shapes

Circle

Donut

Bad Shapes

IIT Mandi North Campus

A Real Application

The City of Berlin

Q: How to draw the map of the city from a noisy point-cloud of GPS locations?

A Sample Reconstruction

Source: mapconstruction.org

The Mathematical Formulation

  • Shape: A Shape is modeled as a metric space \((M,d_M)\).

    • Riemannian manifold
    • metric graph
    • general compact set
  • Sample: A finite metric space \((X,d_X)\) close to \(M\).

    • small Hausdorff proximity if \(M\) is a Euclidean submanifold and \(X\subset\mathbb R^d\)
    • alternatively, small Gromov–Hausdorff distance.

  • Goal: Infer the topology of \(M\) from \(X\).

    • Estimate only the Betti numbers—number of connected components, cycles, voids, etc—of \(M\).

    • construct a topological space \(\widetilde{M}\) from \(X\) to retain the topology of \(M\), i.e., \(M\simeq\widetilde{M}\) in some appropriate sense.

      • homotopy equivalent
      • homeomorphic
      • geometrically close

Vietoris–Rips Complex

  • a metric space \((X,d_X)\)

  • a scale \(\beta>0\)

  • \(R_\beta(X)\) is an abstract simplicial complex such that

    • each subset \(A\subset X\) of size \(k\) with diameter at most \(\beta\) is a \((k-1)\)-simplex.

The Theme of My Research

  • Design provable reconstruction algorithms particularly for bad shapes
  • Provide a window of resolution or scale for the Vietoris–Rips complexes for an accurate reconstruction
  • Probabilic guarantees in the face of random samples
  • Further the theoretical understanding and computational aspects of the density conditions, e.g., Gromov–Hausdorff distances

Limitations of TDA

  • TDA is theory-heavy with a steep learning curve
  • The community is strong but not as vast
  • Not applicable to data without any instrinsic geometry.
  • We need more open-source software implementation of TDA tools.

Opportunities

  • TDA

Questions

Thank you

References

Fefferman, Charles, Sanjoy Mitter, and Hariharan Narayanan. 2016. “Testing the Manifold Hypothesis.” Journal of the American Mathematical Society 29 (4): 983–1049. https://doi.org/10.1090/jams/852.
Komendarczyk, Rafal, Sushovan Majhi, and Will Tran. 2024. “Topological Stability and Latschev-Type Reconstruction Theorems for \(\boldsymbol{\mathrm{CAT}(κ)}\) Spaces.” arXiv. https://doi.org/10.48550/ARXIV.2406.04259.
Majhi, Sushovan. 2023a. “Demystifying Latschev’s Theorem: Manifold Reconstruction from Noisy Data.”
arXiv:2204.14234 [Math.AT]
. https://doi.org/10.48550/ARXIV.2305.17288.
———. 2023b. “VietorisRips Complexes of Metric Spaces Near a Metric Graph.” Journal of Applied and Computational Topology, May. https://doi.org/10.1007/s41468-023-00122-z.
Niyogi, Partha, Stephen Smale, and Shmuel Weinberger. 2008. “Finding the Homology of Submanifolds with High Confidence from Random Samples.” Discrete And Computational Geometry 39. 1-3: 419–41.
Rai, Anish, Buddha Nath Sharma, Salam Rabindrajit Luwang, Md Nurujjaman, and Sushovan Majhi. 2024. “Identifying Extreme Events in the Stock Market: A Topological Data Analysis.” Chaos 34 (10).